Reducing the Number of Training Samples for Fast Support Vector Machine Classification

Authors

Ravindra Koggalage

Saman Halgamuge

Abstract

Support Vector Machines (SVMs) have gained wide acceptance because of their high generalization ability across a wide range of classification applications. Although SVMs have shown promising classification performance, they are limited by speed, particularly when the training data set is large. The hyperplane constructed by an SVM depends on only a portion of the training samples, called support vectors, that lie close to the decision boundary (the hyperplane). Thus, removing training samples that are not relevant to the support vectors might have no effect on building the proper decision function. We propose using clustering techniques such as K-means to find initial clusters, which are then further altered to identify samples that are not relevant to the SVM decision boundary. This helps to reduce the number of training samples for the SVM without degrading the classification result.
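
The idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the initial clusters are altered, so the rule below is an assumption made only for illustration: clusters containing more than one class are kept whole, and single-class clusters keep only the samples in their outer shell. The names `kmeans`, `reduce_training_set`, and `inner_ratio` are hypothetical, and seeding the centroids with the first `k` points is a simplification, not the authors' procedure.

```python
import math


def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means. The first k points seed the centroids (a simplification)."""
    centroids = list(points[:k])
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign, centroids


def reduce_training_set(points, labels, k, inner_ratio=0.6):
    """Return sorted indices of the samples to keep for SVM training.

    Assumed rule (not taken from the abstract):
      - mixed-class cluster: probably near the boundary, keep every sample;
      - single-class cluster: drop samples closer to the centroid than
        inner_ratio * (largest member distance), keep the outer shell.
    """
    assign, centroids = kmeans(points, k)
    keep = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        if not idx:
            continue
        if len({labels[i] for i in idx}) > 1:
            keep.extend(idx)
            continue
        dist = {i: math.dist(points[i], centroids[c]) for i in idx}
        radius = inner_ratio * max(dist.values())
        keep.extend(i for i in idx if dist[i] >= radius)
    return sorted(keep)


# Toy data: class 0 sits near x = 0, class 1 near x = 10.
points = [(0.0, 0.0), (10.0, 0.0), (0.5, 0.0), (-0.5, 0.0),
          (4.0, 0.0), (10.5, 0.0), (9.5, 0.0), (6.0, 0.0)]
labels = [0, 1, 0, 0, 0, 1, 1, 1]
kept = reduce_training_set(points, labels, k=2)
```

On this toy data both clusters are single-class, and `kept` is `[4, 7]`: the two samples at x = 4 and x = 6 that face the other class. Only these would be passed on to the SVM trainer. In general the outer-shell rule also keeps samples on the far side of a cluster, so it is a conservative filter rather than an exact support-vector finder.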

Similar resources

Automatic Interpretation of UltraCam Imagery by Combination of Support Vector Machine and Knowledge-based Systems

With the development of digital sensors, an increasing number of high-resolution images are available. These images cannot be interpreted manually, which calls for practical, fast, and automatic solutions to environmental and location-based management problems. Land cover classification using high-resolution imagery is a difficult process because of the c...

High performance of the support vector machine in classifying hyperspectral data using a limited dataset

To prospect mineral deposits at regional scale, recognition and classification of hydrothermal alteration zones using remote sensing data is a popular strategy. Due to the large number of spectral bands, classification of the hyperspectral data may be negatively affected by the Hughes phenomenon. A practical way to handle the Hughes problem is preparing a lot of training samples until the size ...

Fast SFFS-Based Algorithm for Feature Selection in Biomedical Datasets

Biomedical datasets usually include a large number of features relative to the number of samples. However, some data dimensions may be less relevant or even irrelevant to the output class. Selection of an optimal subset of features is critical, not only to reduce the processing cost but also to improve the classification results. To this end, this paper presents a hybrid method of filter and wr...

Robustified distance based fuzzy membership function for support vector machine classification

Fuzzification of the support vector machine has been used to deal with the problem of outliers and noise. This is achieved by means of a fuzzy membership function, which is generally built from the distance of the points to the class centroid. The focus of this research is twofold. Firstly, by taking advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...

A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate

The support vector machine (SVM) is a popular classification technique that classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training involves an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...

Publication date: 2004
